Perceptual aspects of voice-source parameters

Authors

  • Ralph van Dinther
  • Armin Kohlrausch
  • Raymond Veld
Abstract

In both speech synthesis and sound coding, it is often beneficial to have a measure that predicts whether, and to what extent, two sounds are different. This chapter addresses the problem of estimating the perceptual effects of small modifications to the spectral envelope of a harmonic sound. A recently proposed auditory model that transforms the physical spectrum into a pattern of specific loudness as a function of critical band rate is investigated. A distance measure based on the concept of partial loudness is presented, which treats detectability in terms of a partial loudness threshold. This approach is adapted to the problem of estimating discrimination thresholds for modifications of the spectral envelope of synthetic vowels. Data obtained from subjective listening tests, using a representative set of stimuli in a three-interval forced-choice (3IFC) adaptive procedure, show that the model makes reasonably good predictions of the discrimination threshold. Systematic deviations from the predicted thresholds may be related to individual differences in auditory filter selectivity. The partial loudness measure is compared with previously proposed distance measures, such as distances between excitation patterns and between specific loudnesses, applied to the same experimental data. An objective test measure shows that the partial loudness measure and the distance between two excitation patterns are equally appropriate as distance measures for predicting audibility thresholds. The distance between two specific loudnesses performs worse than the other two. This chapter (Chapter 8, "Discrimination of spectral envelope distortions") is based on Rao et al. (2001).

Similar resources

Session 4aSCb: Voice and F0 Across Tasks (Poster Session) 4aSCb7. A perceptually and physiologically motivated voice source model

Many glottal source models have been proposed, but none has been systematically validated perceptually. Our previous work showed that model fitting of the negative peak of the flow derivative is the most important predictor of perceptual similarity to the target voice. In this study, a new voice source model is proposed to capture perceptually-important source shape aspects. This new model, alo...

An HMM-based speech synthesiser using glottal post-filtering

Control over voice quality, e.g. breathy and tense voice, is important for speech synthesis applications. For example, transformations can be used to modify aspects of the voice related to the speaker's identity and to improve expressiveness. However, it is hard to modify voice characteristics of the synthetic speech without degrading speech quality. State-of-the-art statistical speech synthesiser...

Immediate effects of vocal warm-up exercises on elementary teachers' voice

Introduction: Teachers are a large group of professional voice users who are exposed to many voice problems. Vocal warm-up exercises (VWUE) can prepare the muscles involved in vocalization before teaching and can reduce voice damage in teachers. However, limited studies have examined the effects of VWUE on teachers' voices. Therefore, the present study was conducted to investigate the immediate...

Vocal quality factors: analysis, synthesis, and perception.

The purpose of this study was to examine several factors of vocal quality that might be affected by changes in vocal fold vibratory patterns. Four voice types were examined: modal, vocal fry, falsetto, and breathy. Three categories of analysis techniques were developed to extract source-related features from speech and electroglottographic (EGG) signals. Four factors were found to be important ...

Acoustic and perceptual voice characteristics in LP: a report of two cases

Background and Purpose: Lipoid Proteinosis (LP) is a rare hereditary progressive disorder caused by a disorder of collagen metabolism. In LP, hyaline deposits in the mucous membrane of the true vocal folds cause hoarseness. Studies on the laryngeal and voice features of patients with LP are rare. To the best of our knowledge, this is the first research to study acoustic and perceptual voice feat...

Publication date: 2003